perm filename CHAP4[4,KMC]24 blob
sn#064008 filedate 1973-09-25 generic text, type T, neo UTF8
00100 LANGUAGE-RECOGNITION PROCESSES FOR UNDERSTANDING DIALOGUES
00200 IN TELETYPED PSYCHIATRIC INTERVIEWS
00300
00400 Since the behavior being simulated by this paranoid model is
00500 the sequential language-behavior of a paranoid patient in a
00600 psychiatric interview, the model (PARRY) must have an ability to
00700 interpret and respond to natural language input to a degree
00800 sufficient to demonstrate conduct characteristic of the paranoid
00900 mode. By "natural language" I shall mean ordinary American
01000 English such as is used in everyday conversations. It is still
01100 difficult to be explicit about the processes which enable humans to
01200	interpret and respond to natural language. ("A mighty maze! but
01300 not without a plan." - A. Pope). Philosophers, linguists and
01400 psychologists have investigated natural language with various
01500 purposes. Few of the results have been useful to builders of
01600 interactive simulation models. Attempts have been made in artificial
01700	intelligence to write algorithms which "understand" teletyped
01800 natural language expressions. (Colby and Enea,1967; Enea and
01900 Colby,1973; Schank, Goldman, Rieger, and Riesbeck,1973;
02000 Winograd,1973; Woods, 1970). Computer understanding of natural
02100 language is actively being attempted today but it is not something to
02200	be completely achieved today or even tomorrow. For our model the
02300 problem at the moment was not to find immediately the best way of
02400 doing it but to find any way at all.
02500 During the 1960's when machine processing of natural language
02600 was dominated by syntactic considerations, it became clear that
02700 syntactical information alone was insufficient to comprehend the
02800 expressions of ordinary conversations. A current view is that to
02900 understand what information is contained in linguistic expressions,
03000 knowledge of syntax and semantics must be combined with beliefs from
03100 a conceptual structure capable of making inferences. How to achieve
03200 this combination efficiently with a large data-base represents a
03300 monumental task for both theory and implementation.
03400 For performance reasons we did not attempt to construct a
03500 conventional linguistic parser to analyze conversational language of
03600 interviews. Parsers to date have had great difficulty in performing
03700 well enough to assign a meaningful interpretation to the expressions
03800 of everyday conversational language in unrestricted English. Purely
03900 syntactic parsers offer a cancerous proliferation of interpretations.
04000	A conventional parser, lacking mechanisms for neglecting and ignoring,
04100 may simply halt when it comes across a word not in its dictionary.
04200	Parsers rely on tight conjunctions of tests instead of the loose
04300	disjunctions needed for gleaning some degree of meaning from everyday
04400 language communication. It is easily observed that people
04500 misunderstand and ununderstand at times and thus remain partially
04600 opaque to one another, a truth which lies at the core of human life
04700 and communication.
04800 How language is understood depends on how people interpret
04900 the meanings of situations they find themselves in. In a dialogue,
05000 language is understood in accordance with a participant's view of the
05100 situation. The participants are interested in both what an utterance
05200	means (what it refers to) and what the utterer means (his
05300 intentions). In a first psychiatric interview the doctor's intention
05400 is to gather certain kinds of information; the patient's intention is
05500 to give information in order to receive help. Such an interview is
05600 not small talk; a job is to be done. Our purpose was to develop a
05700 method for recognizing sequences of everyday English sufficient for
05800 the model to communicate linguistically in a paranoid way in the
05900 circumscribed situation of a psychiatric interview.
06000 We did not try to construct a general-purpose algorithm which
06100 could understand anything said in English by anybody to anybody else
06200 in any dialogue situation. (Does anyone believe it to be currently
06300 possible? The seductive myth of generalization can lead to
06400	trivialization). We sought simply to extract a partial,
06500	idiosyncratic, idiolectic meaning (not the "complete"
06600	meaning, whatever that means) from the input. We utilized a
06700 pattern-directed, rather than a parsing-directed, approach because of
06800 the former's power to ignore irrelevant and unintelligible details.
06900 Natural language is not an agreed-upon universe of discourse
07000 such as arithmetic, wherein symbols have a fixed meaning for everyone
07100 who uses them. What we loosely call "natural language" is actually a
07200 set of history-dependent, selective, and interest-oriented idiolects,
07300 each being unique to the individual with a unique history. (To be
07400 unique does not mean that no property is shared with other
07500 individuals, only that not every property is shared). It is the broad
07600 overlap of idiolects which allows the communication of shared
07700 meanings in everyday conversation.
07800 We took as pragmatic measures of "understanding" the ability
07900 (1) to form a conceptualization so that questions can be answered and
08000 commands carried out, (2) to determine the intention of the
08100 interviewer, (3) to determine the references for pronouns and other
08200 anticipated topics. This straightforward approach to a complex
08300 problem has its drawbacks, as will be shown. We strove for a highly
08400 individualized idiolect sufficient to demonstrate paranoid processes
08500 of an individual in a particular situation rather than for a general
08600 supra-individual or ideal comprehension of English. If the
08700 language-recognition processes of PARRY were to interfere with
08800 demonstrating the paranoid processes, we would consider them
08900 defective and insufficient for our purposes.
09000 The language-recognition process utilized by PARRY first puts
09100 the teletyped input in the form of a list and then determines the
09200	syntactic type of the input expression - question, statement or
09300	imperative - by looking at introductory terms and at punctuation. The
09400 expression-type is then scanned for conceptualizations, i.e. patterns
09500 of contentives consisting of words or word-groups, stress-forms of
09600 speech having conceptual meaning relevant to the model's interests.
09700 The search for conceptualizations ignores (as irrelevant details)
09800 function or closed-class terms (articles, auxiliaries, conjunctions,
09900 prepositions, etc.) except as they might represent a component in a
10000 contentive word-group. For example, the word-group (for a living) is
10100	defined to mean `work' as in "What do you do for a living?" The
10200 conceptualization is classified according to the rules of Fig. 1 as
10300 malevolent, benevolent or neutral. Thus PARRY attempts to judge the
10400 intention of the utterer from the content of the utterance.
10500 (INSERT FIG.1 HERE)
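In present-day notation the recognition process just outlined can be sketched as follows. The fragment is an illustrative reconstruction only, not the model's actual code; the word lists and the malevolence rules are invented stand-ins for the model's tables and for the classification rules of Fig. 1.

```python
# Sketch of the recognition pipeline: listify the input, judge the
# expression-type from introductory terms and punctuation, extract the
# conceptualization (contentives only), and rate the utterer's intent.
# All word lists here are illustrative stand-ins, not PARRY's tables.

WH_TERMS = {"WHAT", "WHY", "WHO", "WHERE", "WHEN", "HOW"}
IMPERATIVE_OPENERS = ("TELL ME", "LETS TALK ABOUT", "LETS DISCUSS")
FUNCTION_WORDS = {"A", "AN", "THE", "DO", "YOU", "I", "TO", "OF",
                  "FOR", "ABOUT", "AND", "OR", "ARE", "AM", "IS"}
MALEVOLENT = {"HURT", "HARM", "HATE", "KILL"}   # stand-in for Fig. 1
BENEVOLENT = {"HELP", "LIKE", "UNDERSTAND"}

def listify(text):
    """Put the teletyped input in the form of a list of words."""
    return text.upper().replace("?", " ?").replace(".", "").split()

def expression_type(words):
    """Question, imperative or declarative, judged from the first
    term and the terminal punctuation."""
    if words[-1] == "?" or words[0] in WH_TERMS:
        return "QUESTION"
    if " ".join(words).startswith(IMPERATIVE_OPENERS):
        return "IMPERATIVE"
    return "DECLARATIVE"

def conceptualization(words):
    """Ignore function (closed-class) terms; the word-group handling
    mentioned in the text is omitted from this sketch."""
    return [w for w in words if w not in FUNCTION_WORDS and w != "?"]

def intent(words):
    """Classify the conceptualization as malevolent, benevolent or
    neutral, in the manner of the rules of Fig. 1."""
    contentives = set(conceptualization(words))
    if contentives & MALEVOLENT:
        return "MALEVOLENT"
    if contentives & BENEVOLENT:
        return "BENEVOLENT"
    return "NEUTRAL"
```

The scan is deliberately loose: anything not recognized as a contentive is simply dropped rather than causing a halt.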
10600 Some special problems a dialogue algorithm must handle in a
10700 psychiatric interview will now be outlined along with a brief
10800 description of how the model deals with them.
10900
11000 QUESTIONS
11100
11200 The principal expression-type used by an interviewer consists
11300 of a question. A question is recognized by its first term being a
11400 "wh-" or "how" form and/or an expression ending with a question-mark.
11500 In teletyped interviews a question may sometimes be put in
11600 declarative form followed by a question mark as in:
11700 (1) PT.- I LIKE TO GAMBLE ON THE HORSES.
11800 (2) DR.- YOU GAMBLE?
11900 Although a question-word or auxiliary verb is missing in (2), the
12000 model recognizes that a question is being asked about its gambling
12100 simply by the question mark.
12200 Particularly difficult are those `when' questions which
12300 require a memory which can assign each event a beginning, an end and
12400 a duration. An improved version of the model should have this
12500 capacity. Also troublesome are questions such as `how often', `how
12600 many', i.e. a `how' followed by a quantifier. If the model has "how
12700 often" on its expectancy list while a topic is under discussion, the
12800 appropriate reply can be made. Otherwise the model fails to
12900 understand.
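The expectancy mechanism for `how'-plus-quantifier questions can be illustrated with a small sketch; the topic name, the pattern and the canned reply are invented for the example.

```python
# Hypothetical expectancy list: while a topic is under discussion,
# the quantifier questions expected about it are paired with replies.
EXPECTANCIES = {
    "GAMBLING": {"HOW OFTEN": "I GO TO THE TRACK EVERY WEEK."},
}

def quantifier_reply(topic, question):
    """Reply only if the `how'-question was on the expectancy list
    for the current topic; otherwise the model fails to understand."""
    for pattern, answer in EXPECTANCIES.get(topic, {}).items():
        if pattern in question.upper():
            return answer
    return None   # not expected: no interpretation is reached
```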
13000 In constructing a simulation of symbolic processes it is
13100	arbitrary how much information to represent in the data-base. Should
13200 PARRY know what is the capital of Alabama? It is trivial to store
13300 tomes of facts and there always will be boundary conditions. We took
13400 the position that the model should know only what we believed it
13500 reasonable to know relative to a few hundred topics expectable in a
13600 psychiatric interview. Thus PARRY performs poorly when subjected to
13700 baiting `exam' questions designed to test its informational
13800 limitations rather than to seek useful psychiatric information.
13900
14000 IMPERATIVES
14100
14200 Typical imperatives in a psychiatric interview consist of
14300 expressions like:
14400 (3) DR.- TELL ME ABOUT YOURSELF.
14500 (4) DR.- LETS DISCUSS YOUR FAMILY.
14600 Such imperatives are actually interrogatives to the
14700 interviewee about the topics they refer to. Since the only physical
14800	action the model can perform is to `talk', imperatives are treated
14900 as requests for information. They are identified by the common
15000 introductory phrases: "tell me", "lets talk about", etc.
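A sketch of this treatment follows; the phrase list is abridged and illustrative.

```python
INTRODUCTORY_PHRASES = ("TELL ME ABOUT", "LETS TALK ABOUT", "LETS DISCUSS")

def imperative_topic(expression):
    """Treat an imperative as a request for information: strip the
    introductory phrase and return the topic asked about."""
    text = expression.upper().rstrip(".")
    for phrase in INTRODUCTORY_PHRASES:
        if text.startswith(phrase):
            return text[len(phrase):].strip()
    return None   # not recognized as an imperative
```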
15100 DECLARATIVES
15200
15300 In this category is lumped everything else. It includes
15400 greetings, farewells, yes-no type answers, existence assertions and
15500 the usual predications.
15600
15700 AMBIGUITIES
15800
15900 Words have more than one sense, a convenience for human
16000 memories but a struggle for language-understanding algorithms.
16100 Consider the word "bug" in the following expressions:
16200 (5) AM I BUGGING YOU?
16300 (6) AFTER A PERIOD OF HEAVY DRINKING HAVE YOU FELT BUGS ON
16400 YOUR SKIN?
16500 (7) DO YOU THINK THEY PUT A BUG IN YOUR ROOM?
16600 In expression (5) the term "bug" means to annoy, in (6) it
16700 refers to an insect and in (7) it refers to a microphone used for
16800	hidden surveillance. PARRY uses context to carry out
16900 disambiguation. For example, when the Mafia is under discussion and
17000 the affect-variable of fear is high, the model interprets "bug" to
17100 mean microphone. In constructing this hypothetical individual we
17200 took advantage of the selective nature of idiolects which can have an
17300 arbitrary restriction on word senses. One characteristic of the
17400	paranoid mode is that regardless of what sense of a word the
17500 interviewer intends, the patient may idiosyncratically interpret it
17600 as some sense of his own. This property is obviously of great help
17700 for an interactive simulation with limited language-understanding
17800 abilities.
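The contextual disambiguation of "bug" can be sketched as follows; the topic names and the fear threshold are illustrative values, not the model's actual parameters.

```python
def sense_of_bug(current_topic, fear):
    """Choose a word-sense for `bug' from conversational context:
    the topic under discussion plus the level of the fear variable."""
    if current_topic == "MAFIA" and fear > 0.5:
        return "MICROPHONE"   # hidden-surveillance sense, as in (7)
    if current_topic == "DRINKING":
        return "INSECT"       # as in (6)
    return "ANNOY"            # as in (5)
```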
17900 ANAPHORIC REFERENCES
18000 The common anaphoric references consist of the pronouns "it",
18100 "he", "him", "she", "her", "they", "them" as in:
18200 (8) PT.-HORSERACING IS MY HOBBY.
18300 (9) DR.-WHAT DO YOU ENJOY ABOUT IT?
18400 When a topic is introduced by the patient as in (8), a
18500 number of things can be expected to be asked about it. Thus the
18600 algorithm has ready an updated expectancy-anaphora list which allows
18700 it to determine whether the topic introduced by the model is being
18800 responded to or whether the interviewer is continuing with the
18900 previous topic.
19000 The algorithm recognizes "it" in (9) as referring to
19100 "horseracing" because a flag for horseracing was set when horseracing
19200 was introduced in (8), "it" was placed on the expected anaphora list,
19300 and no new topic has been introduced. A more difficult problem arises
19400 when the anaphoric reference points more than one I-O pair back in
19500 the dialogue as in:
19600 (10) PT.-THE MAFIA IS OUT TO GET ME.
19700 (11) DR.- ARE YOU AFRAID OF THEM?
19800 (12) PT.- MAYBE.
19900 (13) DR.- WHY IS THAT?
20000 The "that" of expression (13) does not refer to (12) but to
20100 the topic of being afraid which the interviewer introduced in (11).
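The flag-and-expectancy treatment of anaphora can be sketched as follows; the class and its names are invented for the illustration.

```python
class TopicMemory:
    """Remember the current topic and the anaphora expected for it."""
    def __init__(self):
        self.topic = None
        self.expected_anaphora = set()

    def introduce(self, topic, pronouns):
        self.topic = topic                     # topic-flag is set
        self.expected_anaphora = set(pronouns)

    def resolve(self, pronoun):
        """A pronoun on the expectancy list refers to the flagged
        topic; anything else is left unresolved."""
        if pronoun.upper() in self.expected_anaphora:
            return self.topic
        return None

memory = TopicMemory()
memory.introduce("HORSERACING", {"IT"})        # as when (8) is uttered
```

With no new topic introduced, the "it" of (9) then resolves to horseracing.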
20200 Another pronominal confusion occurs when the interviewer uses
20300 `we' in two senses as in:
20400 (14) DR.- WE WANT YOU TO STAY IN THE HOSPITAL.
20500 (15) PT.- I WANT TO BE DISCHARGED NOW.
20600 (16) DR.- WE ARE NOT COMMUNICATING.
20700 In expression (14) the interviewer is using "we" to refer to
20800 psychiatrists or the hospital staff while in (16) the term refers to
20900 the interviewer and patient. Identifying the correct referent would
21000 require beliefs about the dialogue itself.
21100
21200 TOPIC SHIFTS
21300
21400 In the main, a psychiatric interviewer is in control of the
21500 interview. When he has gained sufficient information about a topic,
21600 he shifts to a new topic. Naturally the algorithm must detect this
21700 change of topic as in the following:
21800 (17) DR.- HOW DO YOU LIKE THE HOSPITAL?
21900 (18) PT.- ITS NOT HELPING ME TO BE HERE.
22000 (19) DR.- WHAT BROUGHT YOU TO THE HOSPITAL?
22100 (20) PT.- I AM VERY UPSET AND NERVOUS.
22200 (21) DR.- WHAT TENDS TO MAKE YOU NERVOUS?
22300	(22) PT.- JUST BEING AROUND PEOPLE.
22400	(23) DR.- ANYONE IN PARTICULAR?
22500 In (17) and (19) the topic is the hospital. In (21) the topic
22600 changes to causes of the patient's nervous state.
22700 Topics touched upon previously can be re-introduced at any
22800 point in the interview. PARRY knows that a topic has been discussed
22900 previously because a topic-flag is set when a topic comes up.
23000
23100 META-REFERENCES
23200
23300 These are references, not about a topic directly, but about
23400 what has been said about the topic as in:
23500 (25) DR.- WHY ARE YOU IN THE HOSPITAL?
23600 (26) PT.- I SHOULDNT BE HERE.
23700 (27) DR.- WHY DO YOU SAY THAT?
23800	The expression (27) is about and meta to expression (26). The model
23900 does not respond with a reason why it said something but with a
24000 reason for the content of what it said, i.e. it interprets (27) as
24100 "why shouldn't you be here?"
24200 Sometimes when the patient makes a statement, the doctor
24300 replies, not with a question, but with another statement which
24400 constitutes a rejoinder as in:
24500	(28) PT.- I HAVE LOST A LOT OF MONEY GAMBLING.
24600	(29) DR.- I GAMBLE QUITE A BIT ALSO.
24700	Here the algorithm interprets (29) as a directive to
24800 continue discussing gambling, not as an indication to question the
24900 doctor about gambling.
25000
25100 ELLIPSES
25200
25300
25400 In dialogues one finds many ellipses, expressions from which
25500 one or more words are omitted as in:
25600	(30) PT.- I SHOULDNT BE HERE.
25700 (31) DR.- WHY NOT?
25800 Here the complete construction must be understood as:
25900 (32) DR.- WHY SHOULD YOU NOT BE HERE?
26000 Again, this is handled by the expectancy-anaphora list which
26100 anticipates a "why not".
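A sketch of the anticipation mechanism follows; the table entry and the reconstruction are invented for the example.

```python
# When the model asserts something, elliptical rejoinders the
# interviewer might type are anticipated, each paired with its
# full reconstruction.
ANTICIPATED_ELLIPSES = {"WHY NOT": "WHY SHOULD YOU NOT BE HERE"}

def expand_ellipsis(fragment, anticipated=ANTICIPATED_ELLIPSES):
    """Expand an anticipated elliptical fragment to its complete
    construction; return None for an unanticipated fragment."""
    return anticipated.get(fragment.upper().rstrip("?").strip())
```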
26200 The opposite of ellipsis is redundancy which usually provides
26300 no problem since the same thing is being said more than once as in:
26400	(33) DR.- LET ME ASK YOU A QUESTION.
26500 The model simply recognizes (33) as a stereotyped pattern.
26600
26700 SIGNALS
26800
26900 Some fragmentary expressions serve only as directive signals
27000 to proceed, as in:
27100 (34) PT.- I WENT TO THE TRACK LAST WEEK.
27200 (35) DR.- AND?
27300 The fragment of (35) requests a continuation of the story introduced
27400 in (34). The common expressions found in interviews are "and", "so",
27500 "go on", "go ahead", "really", etc. If an input expression cannot be
27600 recognized at all, the lowest level default condition is to assume it
27700 is a signal and either proceed with the next line in a story under
27800 discussion or if a story has been exhausted, begin a new story with a
27900 prompting question or statement.
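This lowest-level default can be sketched as follows; the story lines and the prompting question are invented examples.

```python
def respond_to_signal(story_lines, position, prompt):
    """Default for unrecognized input: proceed with the next line of
    the story under discussion, or begin anew with a prompting
    question when the story is exhausted."""
    if position < len(story_lines):
        return story_lines[position], position + 1
    return prompt, position
```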
28000
28100 IDIOMS
28200
28300 Since so much of conversational language involves stereotypes
28400 and special cases, the task of recognition is much easier than that
28500 of linguistic analysis. This is particularly true of idioms. Either
28600 one knows what an idiom means or one does not. It is usually hopeless
28700 to try to decipher what an idiom means from an analysis of its
28800 constituent parts. If the reader doubts this, let him ponder the
28900 following expressions taken from actual teletyped interviews.
29000 (36) DR.- WHATS EATING YOU?
29100 (37) DR.- YOU SOUND KIND OF PISSED OFF.
29200 (38) DR.- WHAT ARE YOU DRIVING AT?
29300 (39) DR.- ARE YOU PUTTING ME ON?
29400 (40) DR.- WHY ARE THEY AFTER YOU?
29500 (41) DR.- HOW DO YOU GET ALONG WITH THE OTHER PATIENTS?
29600 (42) DR.- HOW DO YOU LIKE YOUR WORK?
29700 (43) DR.- HAVE THEY TRIED TO GET EVEN WITH YOU?
29800 (44) DR.- I CANT KEEP UP WITH YOU.
29900 In people, the understanding of idioms is a matter of rote
30000 memory. In an algorithm, idioms can simply be stored as such. As
30100 each new idiom appears in teletyped interviews, its
30200 recognition-pattern is added to the data-base on the inductive
30300 grounds that what happens once can happen again.
30400 Another advantage in constructing an idiolect for a model is
30500 that it recognizes its own idiomatic expressions which tend to be
30600 used by the interviewer (if he understands them) as in:
30700 (45) PT.- THEY ARE OUT TO GET ME.
30800 (46) DR.- WHAT MAKES YOU THINK THEY ARE OUT TO GET YOU.
30900	The expression (45) is really a double idiom in which "out"
31000 means `intend' and "get" means `harm' in this context. Needless to
31100	say, an algorithm which tried to pair off the various meanings of
31200 "out" with the various meanings of "get" would have a hard time of
31300 it. But an algorithm which recognizes what it itself is capable of
31400 saying, can easily recognize echoed idioms.
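Rote storage and lookup of idioms can be sketched as follows; the glosses in the table are invented illustrations, not the model's stored patterns.

```python
# Idioms are stored whole, on the inductive grounds that what happens
# once can happen again; deciphering them from their parts is hopeless.
IDIOM_MEANINGS = {
    "WHATS EATING YOU": "WHAT IS ANNOYING YOU",
    "ARE YOU PUTTING ME ON": "ARE YOU DECEIVING ME",
    "OUT TO GET": "INTEND TO HARM",   # the model's own idiom, echoed
}

def recognize_idioms(expression):
    """Look up stored idioms by rote rather than by analyzing their
    constituent parts."""
    text = expression.upper().rstrip("?.")
    return [gloss for idiom, gloss in IDIOM_MEANINGS.items()
            if idiom in text]
```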
31500
31600 FUZZ TERMS
31700
31800 In this category fall a large number of expressions which, as
31900 non-contentives, have little or no meaning and therefore can be
32000 ignored by the algorithm. The lower-case expressions in the following
32100 are examples of fuzz:
32200 (47) DR.- well now perhaps YOU CAN TELL ME something ABOUT
32300 YOUR FAMILY.
32400 (48) DR.- on the other hand I AM INTERESTED IN YOU.
32500 (49) DR.- hey I ASKED YOU A QUESTION.
32600 The algorithm has "ignoring mechanisms" which allow for an
32700 `anything' slot in its pattern recognition. Fuzz terms are thus
32800 easily ignored and no attempt is made to analyze them.
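The `anything' slot can be realized as an ordered-subsequence match: the pattern's contentives must appear in order, and whatever lies between them falls into the slot and is skipped. A minimal sketch, not the model's actual matcher:

```python
def matches(pattern, expression):
    """Match a contentive pattern against input, ignoring fuzz: the
    pattern terms need only occur in order; intervening words fall
    into an `anything' slot and are skipped."""
    words = iter(expression.upper().replace(".", "").split())
    # `term in words' advances the iterator, so each term must be
    # found after the previous one.
    return all(term in words for term in pattern)
```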
32900
33000 SUBORDINATE CLAUSES
33100
33200 A subordinate clause is a complete statement inside another
33300 statement. It is most frequently introduced by a relative pronoun,
33400 indicated in the following expressions by lower case:
33500 (50) DR.- WAS IT THE UNDERWORLD that PUT YOU HERE?
33600 (51) DR.- WHO ARE THE PEOPLE who UPSET YOU?
33700 (52) DR.- HAS ANYTHING HAPPENED which YOU DONT UNDERSTAND?
33800 One of the linguistic weaknesses of the model is that it
33900 takes the entire input as a single expression. When the input is
34000 syntactically complex, containing subordinate clauses, the algorithm
34100 can become confused. To avoid this, future versions of PARRY will
34200 segment the input into shorter and more manageable patterns in which
34300 an optimal selection of emphases and neglect of irrelevant detail can
34400 be achieved while avoiding combinatorial explosions.
34500 VOCABULARY
34600
34700 How many words should there be in the algorithm's vocabulary?
34800 It is a rare human speaker of English who can recognize 40% of the
34900 415,000 words in the Oxford English Dictionary. In his everyday
35000 conversation an educated person uses perhaps 10,000 words and has a
35100 recognition vocabulary of about 50,000 words. A study of telephone
35200	conversations showed that 96% of the talk employed only 737 words
35300	(French, Carter, and Koenig, 1930). Of course if the remaining 4% are
35400	important but unrecognized contentives, the result may be ruinous to
35500 the coherence of a conversation.
35600 In counting all the words in 53 teletyped psychiatric
35700 interviews conducted by psychiatrists, we found only 721 different
35800 words. Since we are familiar with psychiatric vocabularies and
35900 styles of expression, we believed this language-algorithm could
36000 function adequately with a vocabulary of at most a few thousand
36100 contentives. There will always be unrecognized words. The algorithm
36200 must be able to continue even if it does not have a particular word
36300 in its vocabulary. This provision represents one great advantage
36400 of pattern-matching over conventional linguistic parsing. Our
36500 algorithm can guess while a traditional parser must know with
36600 certainty in order to proceed.
36700
36800 MISSPELLINGS AND EXTRA CHARACTERS
36900 There is really no good defense against misspellings in a
37000 teletyped interview except having a human monitor the conversation
37100 and make the necessary corrections. Spelling correcting programs are
37200 slow, inefficient, and imperfect. They experience great problems
37300 when it is the first character in a word which is incorrect.
37400 Extra characters sent over the teletype by the interviewer or
37500 by a bad phone line can be removed by a human monitor since the
37600 output from the interviewer first appears on the monitor's console
37700 and then is typed by her directly to the program.
37800
37900 META VERBS
38000
38100 Certain common verbs such as "think", "feel", "believe", etc.
38200	can take a clause as their objects as in:
38300 (54) DR.- I THINK YOU ARE RIGHT.
38400 (55) DR.- WHY DO YOU FEEL THE GAMBLING IS CROOKED?
38500 The verb "believe" is peculiar since it can also take as
38600 object a noun or noun phrase as in:
38700 (56) DR.- I BELIEVE YOU.
38800 In expression (55) the conjunction "that" can follow the word
38900 "feel" signifying a subordinate clause. This is not the case after
39000 "believe" in expression (56). PARRY makes the correct
39100 identification in (56) because nothing follows the "you".
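The identification can be sketched as a check on what follows the verb; the test is simplified and the function name is invented.

```python
def object_type(verb, expression):
    """A meta verb like THINK, FEEL or BELIEVE takes a clause object
    unless nothing follows the pronoun after the verb."""
    words = expression.upper().rstrip(".?").split()
    rest = words[words.index(verb) + 1:]
    return "NOUN-OBJECT" if len(rest) == 1 else "CLAUSE-OBJECT"
```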
39200 ODD WORDS
39300 From extensive experience with teletyped interviews, we
39400 learned the model must have patterns for "odd" words. We term them
39500 such since these are words which are quite natural in the usual
39600 vis-a-vis interview in which the participants communicate through
39700 speech, but which are quite odd in the context of a teletyped
39800 interview. This should be clear from the following examples in which
39900 the odd words appear in lower case:
40000 (57) DR.-YOU sound CONFUSED.
40100 (58) DR.- DID YOU hear MY LAST QUESTION?
40200 (59) DR.- WOULD YOU come in AND sit down PLEASE?
40300 (60) DR.- CAN YOU say WHO?
40400 (61) DR.- I WILL see YOU AGAIN TOMORROW.
40500
40600
40700 MISUNDERSTANDING
40800
40900 It is perhaps not fully recognized by students of language
41000 how often people misunderstand one another in conversation and yet
41100 their dialogues proceed as if understanding and being understood had
41200 taken place.
41300 A classic example is the following man-on-the-street interview.
41400 INTERVIEWER - WHAT DO YOU THINK OF MARIHUANA?
41500 MAN - DIRTIEST TOWN IN MEXICO.
41600 INTERVIEWER - HOW ABOUT LSD?
41700 MAN - I VOTED FOR HIM.
41800 INTERVIEWER - HOW DO YOU FEEL ABOUT THE INDIANAPOLIS 500?
41900 MAN - I THINK THEY SHOULD SHOOT EVERY LAST ONE OF THEM.
42000 INTERVIEWER - AND THE VIET CONG POSITION?
42100 MAN - I'M FOR IT, BUT MY WIFE COMPLAINS ABOUT HER ELBOWS.
42200 Sometimes a psychiatric interviewer realizes when
42300 misunderstanding occurs and tries to correct it. Other times he
42400 simply passes it by. It is characteristic of the paranoid mode to
42500 respond idiosyncratically to particular word-concepts regardless of
42600 what the interviewer is saying:
42700 (62) PT.- SOME PEOPLE HERE MAKE ME NERVOUS.
42800 (63) DR.- I BET.
42900 (64) PT.- GAMBLING HAS BEEN NOTHING BUT TROUBLE FOR ME.
43000 Here one word sense of "bet" (to wager) is confused with the offered
43100 sense of expressing agreement. As has been mentioned, this
43200 sense-confusion property of paranoid conversation eases the task of
43300 simulation.
43400 UNUNDERSTANDING
43500
43600 A dialogue algorithm must be prepared for situations in which
43700 it simply does not understand. It cannot arrive at any interpretation
43800 as to what the interviewer is saying since no pattern can be matched.
43900 It may recognize the topic but not what is being said about it.
44000 The language-recognizer should not be faulted for a simple
44100 lack of irrelevant information as in:
44200 (65) DR.- WHAT IS THE FIFTIETH STATE?
44300 when the data-base does not contain the answer. In this default
44400 condition it is simplest to reply:
44500 (66) PT.- I DONT KNOW.
44600 When information is absent it is dangerous to reply:
44700 (67) PT.- COULD YOU REPHRASE THE QUESTION?
44800 because of the disastrous loops which can result.
44900 Since the main problem in the default condition of
45000 ununderstanding is how to continue, PARRY employs heuristics such
45100 as changing the level of the dialogue and asking about the
45200 interviewer's intention as in:
45300 (68) PT.- WHY DO YOU WANT TO KNOW THAT?
45400 or rigidly continuing with a previous topic or introducing a new
45500 topic.
45600 These are admittedly desperate measures intended to prompt
45700 the interviewer in directions the algorithm has a better chance of
45800 understanding. Although it is usually the interviewer who controls
45900 the flow from topic to topic, there are times when control must be
46000 assumed by the model.
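The ordering of these desperate measures can be sketched as follows; the replies and the priorities are illustrative choices, not the model's actual heuristics.

```python
def ununderstanding_reply(fact_missing, previous_topic):
    """Defaults when no pattern matches: admit ignorance when a fact
    is simply absent; otherwise continue a flagged topic or question
    the interviewer's intention. (Asking for a rephrase is avoided
    because of the disastrous loops it can produce.)"""
    if fact_missing:
        return "I DONT KNOW."
    if previous_topic:
        return "LETS GET BACK TO " + previous_topic + "."
    return "WHY DO YOU WANT TO KNOW THAT?"
```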
46100 There are many additional problems in understanding
46200 conversational language but the description of this chapter should be
46300 sufficient to convey some of the complexities involved. Further
46400 examples will be presented in the next chapter in describing the
46500 logic of the central processes of the model.